#load needed packages. make sure they are installed.
pacman::p_load(here, knitr, tidyverse, skimr, fpp2, tigris, plotly, viridis, transformr, htmlwidgets, gt, gtExtras, widgetframe)
theme_set(theme_minimal())An example exploratory analysis script
Setup
Load the data.
d1 <- readRDS(here('data','processed-data','processed-crime.rds'))
tig_zips <- zctas(cb=TRUE, starts_with = c(unique(d1$Zip.Code)), year = 2020)ZCTAs can take several minutes to download. To cache the data and avoid re-downloading in future R sessions, set `options(tigris_use_cache = TRUE)`
|
| | 0%
|
| | 1%
|
|= | 1%
|
|= | 2%
|
|== | 3%
|
|=== | 4%
|
|=== | 5%
|
|==== | 5%
|
|==== | 6%
|
|===== | 7%
|
|===== | 8%
|
|======= | 10%
|
|======= | 11%
|
|======== | 11%
|
|======== | 12%
|
|========= | 12%
|
|========= | 13%
|
|========== | 14%
|
|========== | 15%
|
|=========== | 15%
|
|=========== | 16%
|
|============ | 16%
|
|============ | 17%
|
|============ | 18%
|
|============= | 18%
|
|============= | 19%
|
|============== | 19%
|
|============== | 20%
|
|============== | 21%
|
|=============== | 21%
|
|=============== | 22%
|
|================ | 22%
|
|================ | 23%
|
|================ | 24%
|
|================= | 24%
|
|================= | 25%
|
|================== | 25%
|
|================== | 26%
|
|=================== | 26%
|
|=================== | 27%
|
|=================== | 28%
|
|==================== | 28%
|
|==================== | 29%
|
|===================== | 29%
|
|===================== | 30%
|
|===================== | 31%
|
|====================== | 31%
|
|====================== | 32%
|
|======================= | 32%
|
|======================= | 33%
|
|======================= | 34%
|
|======================== | 34%
|
|======================== | 35%
|
|========================= | 35%
|
|========================= | 36%
|
|========================== | 36%
|
|========================== | 37%
|
|=========================== | 38%
|
|=========================== | 39%
|
|============================ | 40%
|
|============================= | 41%
|
|============================= | 42%
|
|============================== | 42%
|
|============================== | 43%
|
|============================== | 44%
|
|=============================== | 44%
|
|=============================== | 45%
|
|================================ | 45%
|
|================================ | 46%
|
|================================= | 47%
|
|================================= | 48%
|
|================================== | 48%
|
|================================== | 49%
|
|=================================== | 49%
|
|=================================== | 50%
|
|=================================== | 51%
|
|==================================== | 51%
|
|==================================== | 52%
|
|===================================== | 52%
|
|===================================== | 53%
|
|===================================== | 54%
|
|====================================== | 54%
|
|====================================== | 55%
|
|======================================= | 55%
|
|======================================= | 56%
|
|======================================== | 56%
|
|======================================== | 57%
|
|========================================= | 58%
|
|========================================= | 59%
|
|========================================== | 59%
|
|========================================== | 60%
|
|========================================== | 61%
|
|=========================================== | 61%
|
|=========================================== | 62%
|
|============================================ | 62%
|
|============================================ | 63%
|
|============================================= | 64%
|
|============================================= | 65%
|
|============================================== | 65%
|
|============================================== | 66%
|
|=============================================== | 66%
|
|=============================================== | 67%
|
|=============================================== | 68%
|
|================================================ | 68%
|
|================================================ | 69%
|
|================================================= | 69%
|
|================================================= | 70%
|
|================================================= | 71%
|
|================================================== | 71%
|
|================================================== | 72%
|
|=================================================== | 72%
|
|=================================================== | 73%
|
|=================================================== | 74%
|
|==================================================== | 74%
|
|==================================================== | 75%
|
|===================================================== | 75%
|
|===================================================== | 76%
|
|====================================================== | 76%
|
|====================================================== | 77%
|
|====================================================== | 78%
|
|======================================================= | 78%
|
|======================================================= | 79%
|
|======================================================== | 79%
|
|======================================================== | 80%
|
|======================================================== | 81%
|
|========================================================= | 81%
|
|========================================================== | 83%
|
|=========================================================== | 85%
|
|============================================================ | 85%
|
|============================================================ | 86%
|
|============================================================= | 86%
|
|============================================================= | 87%
|
|============================================================= | 88%
|
|============================================================== | 88%
|
|============================================================== | 89%
|
|=============================================================== | 89%
|
|=============================================================== | 90%
|
|=============================================================== | 91%
|
|================================================================ | 91%
|
|================================================================ | 92%
|
|================================================================= | 92%
|
|================================================================= | 93%
|
|================================================================== | 94%
|
|================================================================== | 95%
|
|=================================================================== | 95%
|
|=================================================================== | 96%
|
|==================================================================== | 96%
|
|==================================================================== | 97%
|
|==================================================================== | 98%
|
|===================================================================== | 98%
|
|===================================================================== | 99%
|
|======================================================================| 99%
|
|======================================================================| 100%
Data exploration through tables
Showing a bit of code to produce and save a summary table.
summary(d1) Incident.Number Highest.Offense.Description Highest.Offense.Code
Min. :2.004e+04 Length:2461621 Min. : 100
1st Qu.:2.005e+10 Class :character 1st Qu.: 601
Median :2.011e+10 Mode :character Median :1199
Mean :6.032e+10 Mean :1689
3rd Qu.:2.017e+10 3rd Qu.:2716
Max. :2.024e+12 Max. :8905
Family.Violence Occurred.Date.Time Occurred.Date
Length:2461621 Min. :2003-01-01 00:00:00.00 Min. :2003-01-01
Class :character 1st Qu.:2007-11-16 16:49:00.00 1st Qu.:2007-11-16
Mode :character Median :2012-05-28 23:09:00.00 Median :2012-05-28
Mean :2012-11-25 19:50:28.18 Mean :2012-11-25
3rd Qu.:2017-10-26 21:19:00.00 3rd Qu.:2017-10-26
Max. :2024-06-01 23:46:00.00 Max. :2024-06-01
Occurred.Time Report.Date.Time Report.Date
Length:2461621 Min. :2002-11-29 05:30:00.00 Min. :2002-11-29
Class1:hms 1st Qu.:2007-11-27 22:41:00.00 1st Qu.:2007-11-27
Class2:difftime Median :2012-06-06 11:15:00.00 Median :2012-06-06
Mode :numeric Mean :2012-12-04 16:45:39.02 Mean :2012-12-04
3rd Qu.:2017-11-05 01:56:00.00 3rd Qu.:2017-11-05
Max. :2024-06-02 01:20:00.00 Max. :2024-06-02
Report.Time Location.Type Address Zip.Code
Length:2461621 Length:2461621 Length:2461621 Min. :76574
Class1:hms Class :character Class :character 1st Qu.:78717
Class2:difftime Mode :character Mode :character Median :78741
Mode :numeric Mean :78732
3rd Qu.:78752
Max. :78759
Council.District APD.Sector APD.District PRA
Min. : 1.000 Length:2461621 Length:2461621 Length:2461621
1st Qu.: 3.000 Class :character Class :character Class :character
Median : 4.000 Mode :character Mode :character Mode :character
Mean : 4.965
3rd Qu.: 7.000
Max. :10.000
NA's :30699
Census.Tract Clearance.Status Clearance.Date UCR.Category
Min. : 1.0 Length:2461621 Min. :2003-01-01 Length:2461621
1st Qu.: 15.0 Class :character 1st Qu.:2008-04-07 Class :character
Median : 23.2 Mode :character Median :2012-10-17 Mode :character
Mean : 245.4 Mean :2013-03-14
3rd Qu.: 338.0 3rd Qu.:2018-01-19
Max. :950800.0 Max. :2024-06-02
NA's :8822 NA's :348308
Category.Description X.coordinate Y.coordinate Latitude
Length:2461621 Min. : 0 Min. : 0 Min. :30.01
Class :character 1st Qu.:3108421 1st Qu.:10057433 1st Qu.:30.23
Mode :character Median :3117292 Median :10073004 Median :30.28
Mean :3075787 Mean : 9946761 Mean :30.29
3rd Qu.:3126595 3rd Qu.:10100561 3rd Qu.:30.35
Max. :3231806 Max. :10215496 Max. :30.67
NA's :32335
Longitude Location Crime.Category
Min. :-98.18 Length:2461621 Length:2461621
1st Qu.:-97.76 Class :character Class :character
Median :-97.73 Mode :character Mode :character
Mean :-97.73
3rd Qu.:-97.70
Max. :-97.37
NA's :32335
head(d1) Incident.Number Highest.Offense.Description Highest.Offense.Code
1 2013851154 SEXUAL ASSAULT OF CHILD/OBJECT 1707
2 20161800084 RAPE OF A CHILD 204
3 2010701921 RAPE 200
4 20071820003 RAPE 200
5 20062192048 SEXUAL ASSAULT W/ OBJECT 1700
6 20033211543 RAPE 200
Family.Violence Occurred.Date.Time Occurred.Date Occurred.Time
1 Y 2009-01-01 00:01:00 2009-01-01 00:01:00
2 Y 2016-06-28 01:05:00 2016-06-28 01:05:00
3 Y 2010-03-04 19:15:00 2010-03-04 19:15:00
4 N 2007-07-01 12:00:00 2007-07-01 12:00:00
5 N 2006-08-07 22:28:00 2006-08-07 22:28:00
6 Y 2003-11-17 14:00:00 2003-11-17 14:00:00
Report.Date.Time Report.Date Report.Time Location.Type
1 2013-03-26 16:56:00 2013-03-26 16:56:00 RESIDENCE / HOME
2 2016-06-28 01:05:00 2016-06-28 01:05:00 RESIDENCE / HOME
3 2010-03-11 17:06:00 2010-03-11 17:06:00 RESIDENCE / HOME
4 2007-07-01 12:00:00 2007-07-01 12:00:00 RESIDENCE / HOME
5 2006-08-07 22:28:00 2006-08-07 22:28:00 RESIDENCE / HOME
6 2003-11-17 21:40:00 2003-11-17 21:40:00 RESIDENCE / HOME
Address Zip.Code Council.District APD.Sector APD.District
1 900 BLOCK E 32ND ST 78705 9 BA 1
2 6900 BLOCK BRANCHWOOD DR 78744 2 FR 8
3 400 BLOCK ANGEL OAK ST 78748 5 FR 2
4 1700 BLOCK WOOTEN DR 78757 7 ID 7
5 500 BLOCK E OLTORF ST 78704 9 DA 2
6 7300 BLOCK DANJEAN DR 78745 NA DA 6
PRA Census.Tract Clearance.Status Clearance.Date UCR.Category
1 348 4.00 C 2013-04-11 11C
2 530 24.41 C 2016-07-01 11A
3 542 24.38 C 2010-03-18 11A
4 247 405.00 O 2007-08-02 11A
5 479 23.23 2006-08-22 11C
6 525 1728.00 O 2003-11-30 11A
Category.Description X.coordinate Y.coordinate Latitude Longitude Location
1 Rape 0 0 NA NA
2 Rape 0 0 NA NA
3 Rape 0 0 NA NA
4 Rape 0 0 NA NA
5 Rape 0 0 NA NA
6 Rape 0 0 NA NA
Crime.Category
1 RAPE
2 RAPE
3 RAPE
4 RAPE
5 RAPE
6 RAPE
str(d1)'data.frame': 2461621 obs. of 28 variables:
$ Incident.Number : num 2.01e+09 2.02e+10 2.01e+09 2.01e+10 2.01e+10 ...
$ Highest.Offense.Description: chr "SEXUAL ASSAULT OF CHILD/OBJECT" "RAPE OF A CHILD" "RAPE" "RAPE" ...
$ Highest.Offense.Code : int 1707 204 200 200 1700 200 902 2703 200 2006 ...
$ Family.Violence : chr "Y" "Y" "Y" "N" ...
$ Occurred.Date.Time : POSIXct, format: "2009-01-01 00:01:00" "2016-06-28 01:05:00" ...
$ Occurred.Date : Date, format: "2009-01-01" "2016-06-28" ...
$ Occurred.Time : 'hms' num 00:01:00 01:05:00 19:15:00 12:00:00 ...
..- attr(*, "units")= chr "secs"
$ Report.Date.Time : POSIXct, format: "2013-03-26 16:56:00" "2016-06-28 01:05:00" ...
$ Report.Date : Date, format: "2013-03-26" "2016-06-28" ...
$ Report.Time : 'hms' num 16:56:00 01:05:00 17:06:00 12:00:00 ...
..- attr(*, "units")= chr "secs"
$ Location.Type : chr "RESIDENCE / HOME" "RESIDENCE / HOME" "RESIDENCE / HOME" "RESIDENCE / HOME" ...
$ Address : chr "900 BLOCK E 32ND ST" "6900 BLOCK BRANCHWOOD DR" "400 BLOCK ANGEL OAK ST" "1700 BLOCK WOOTEN DR" ...
$ Zip.Code : int 78705 78744 78748 78757 78704 78745 78702 78759 78705 78741 ...
$ Council.District : int 9 2 5 7 9 NA 3 10 9 3 ...
$ APD.Sector : chr "BA" "FR" "FR" "ID" ...
$ APD.District : chr "1" "8" "2" "7" ...
$ PRA : chr "348" "530" "542" "247" ...
$ Census.Tract : num 4 24.4 24.4 405 23.2 ...
$ Clearance.Status : chr "C" "C" "C" "O" ...
$ Clearance.Date : Date, format: "2013-04-11" "2016-07-01" ...
$ UCR.Category : chr "11C" "11A" "11A" "11A" ...
$ Category.Description : chr "Rape" "Rape" "Rape" "Rape" ...
$ X.coordinate : int 0 0 0 0 0 0 0 0 0 0 ...
$ Y.coordinate : int 0 0 0 0 0 0 0 0 0 0 ...
$ Latitude : num NA NA NA NA NA NA NA NA NA NA ...
$ Longitude : num NA NA NA NA NA NA NA NA NA NA ...
$ Location : chr "" "" "" "" ...
$ Crime.Category : chr "RAPE" "RAPE" "RAPE" "RAPE" ...
skim(d1)| Name | d1 |
| Number of rows | 2461621 |
| Number of columns | 28 |
| _______________________ | |
| Column type frequency: | |
| character | 12 |
| Date | 3 |
| difftime | 2 |
| numeric | 9 |
| POSIXct | 2 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| Highest.Offense.Description | 0 | 1 | 3 | 48 | 0 | 436 | 0 |
| Family.Violence | 0 | 1 | 1 | 1 | 0 | 2 | 0 |
| Location.Type | 0 | 1 | 7 | 47 | 0 | 47 | 0 |
| Address | 0 | 1 | 8 | 74 | 0 | 246951 | 0 |
| APD.Sector | 0 | 1 | 2 | 5 | 0 | 14 | 0 |
| APD.District | 0 | 1 | 1 | 2 | 0 | 21 | 0 |
| PRA | 0 | 1 | 1 | 4 | 0 | 742 | 0 |
| Clearance.Status | 0 | 1 | 0 | 1 | 615856 | 4 | 0 |
| UCR.Category | 0 | 1 | 0 | 3 | 1550375 | 17 | 0 |
| Category.Description | 0 | 1 | 0 | 18 | 1550375 | 8 | 0 |
| Location | 0 | 1 | 0 | 27 | 32335 | 219842 | 0 |
| Crime.Category | 0 | 1 | 4 | 29 | 0 | 33 | 0 |
Variable type: Date
| skim_variable | n_missing | complete_rate | min | max | median | n_unique |
|---|---|---|---|---|---|---|
| Occurred.Date | 0 | 1.00 | 2003-01-01 | 2024-06-01 | 2012-05-28 | 7823 |
| Report.Date | 0 | 1.00 | 2002-11-29 | 2024-06-02 | 2012-06-06 | 7825 |
| Clearance.Date | 348308 | 0.86 | 2003-01-01 | 2024-06-02 | 2012-10-17 | 7814 |
Variable type: difftime
| skim_variable | n_missing | complete_rate | min | max | median | n_unique |
|---|---|---|---|---|---|---|
| Occurred.Time | 0 | 1 | 0 secs | 86340 secs | 14:25:00 | 1440 |
| Report.Time | 0 | 1 | 0 secs | 86340 secs | 14:06:00 | 1440 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| Incident.Number | 0 | 1.00 | 6.031558e+10 | 2.896224e+11 | 20035.00 | 2.005329e+10 | 2.010505e+10 | 2.017186e+10 | 2.024242e+12 | ▇▁▁▁▁ |
| Highest.Offense.Code | 0 | 1.00 | 1.689080e+03 | 1.218280e+03 | 100.00 | 6.010000e+02 | 1.199000e+03 | 2.716000e+03 | 8.905000e+03 | ▇▅▁▁▁ |
| Zip.Code | 0 | 1.00 | 7.873243e+04 | 2.510000e+01 | 76574.00 | 7.871700e+04 | 7.874100e+04 | 7.875200e+04 | 7.875900e+04 | ▁▁▁▁▇ |
| Council.District | 30699 | 0.99 | 4.960000e+00 | 2.840000e+00 | 1.00 | 3.000000e+00 | 4.000000e+00 | 7.000000e+00 | 1.000000e+01 | ▅▇▃▃▅ |
| Census.Tract | 8822 | 1.00 | 2.453700e+02 | 3.363970e+03 | 1.00 | 1.500000e+01 | 2.324000e+01 | 3.380000e+02 | 9.508000e+05 | ▇▁▁▁▁ |
| X.coordinate | 0 | 1.00 | 3.075787e+06 | 3.551571e+05 | 0.00 | 3.108421e+06 | 3.117292e+06 | 3.126595e+06 | 3.231806e+06 | ▁▁▁▁▇ |
| Y.coordinate | 0 | 1.00 | 9.946761e+06 | 1.147895e+06 | 0.00 | 1.005743e+07 | 1.007300e+07 | 1.010056e+07 | 1.021550e+07 | ▁▁▁▁▇ |
| Latitude | 32335 | 0.99 | 3.029000e+01 | 8.000000e-02 | 30.01 | 3.023000e+01 | 3.028000e+01 | 3.035000e+01 | 3.067000e+01 | ▁▇▇▂▁ |
| Longitude | 32335 | 0.99 | -9.773000e+01 | 5.000000e-02 | -98.18 | -9.776000e+01 | -9.773000e+01 | -9.770000e+01 | -9.737000e+01 | ▁▁▇▂▁ |
Variable type: POSIXct
| skim_variable | n_missing | complete_rate | min | max | median | n_unique |
|---|---|---|---|---|---|---|
| Occurred.Date.Time | 0 | 1 | 2003-01-01 00:00:00 | 2024-06-01 23:46:00 | 2012-05-28 23:09:00 | 1738386 |
| Report.Date.Time | 0 | 1 | 2002-11-29 05:30:00 | 2024-06-02 01:20:00 | 2012-06-06 11:15:00 | 2169726 |
summary(tig_zips) ZCTA5CE20 AFFGEOID20 GEOID20 NAME20
Length:68 Length:68 Length:68 Length:68
Class :character Class :character Class :character Class :character
Mode :character Mode :character Mode :character Mode :character
LSAD20 ALAND20 AWATER20 geometry
Length:68 Min. : 1475441 Min. : 0 MULTIPOLYGON :68
Class :character 1st Qu.: 22687804 1st Qu.: 0 epsg:4269 : 0
Mode :character Median : 41116414 Median : 257508 +proj=long...: 0
Mean : 98762692 Mean : 1867545
3rd Qu.:110644620 3rd Qu.: 1117435
Max. :521452503 Max. :26097562
head(tig_zips)Simple feature collection with 6 features and 7 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -98.27747 ymin: 30.02595 xmax: -97.21823 ymax: 30.49132
Geodetic CRS: NAD83
ZCTA5CE20 AFFGEOID20 GEOID20 NAME20 LSAD20 ALAND20 AWATER20
477 78653 860Z200US78653 78653 78653 Z5 272611245 872369
1478 78610 860Z200US78610 78610 78610 Z5 250909063 918933
1884 78722 860Z200US78722 78722 78722 Z5 3510327 0
5355 78757 860Z200US78757 78757 78757 Z5 12823683 0
6063 78620 860Z200US78620 78620 78620 Z5 454800649 1116065
6064 78621 860Z200US78621 78621 78621 Z5 460284787 2487757
geometry
477 MULTIPOLYGON (((-97.61061 3...
1478 MULTIPOLYGON (((-98.01582 3...
1884 MULTIPOLYGON (((-97.7274 30...
5355 MULTIPOLYGON (((-97.75528 3...
6063 MULTIPOLYGON (((-98.27626 3...
6064 MULTIPOLYGON (((-97.50169 3...
str(tig_zips)Classes 'sf' and 'data.frame': 68 obs. of 8 variables:
$ ZCTA5CE20 : chr "78653" "78610" "78722" "78757" ...
$ AFFGEOID20: chr "860Z200US78653" "860Z200US78610" "860Z200US78722" "860Z200US78757" ...
$ GEOID20 : chr "78653" "78610" "78722" "78757" ...
$ NAME20 : chr "78653" "78610" "78722" "78757" ...
$ LSAD20 : chr "Z5" "Z5" "Z5" "Z5" ...
$ ALAND20 : num 2.73e+08 2.51e+08 3.51e+06 1.28e+07 4.55e+08 ...
$ AWATER20 : num 872369 918933 0 0 1116065 ...
$ geometry :sfc_MULTIPOLYGON of length 68; first list element: List of 2
..$ :List of 1
.. ..$ : num [1:6, 1:2] -97.6 -97.6 -97.6 -97.6 -97.6 ...
..$ :List of 1
.. ..$ : num [1:448, 1:2] -97.6 -97.6 -97.6 -97.6 -97.6 ...
..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
- attr(*, "sf_column")= chr "geometry"
- attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA
..- attr(*, "names")= chr [1:7] "ZCTA5CE20" "AFFGEOID20" "GEOID20" "NAME20" ...
- attr(*, "tigris")= chr "zcta"
skim(tig_zips)Warning: Couldn't find skimmers for class: sfc_MULTIPOLYGON, sfc; No
user-defined `sfl` provided. Falling back to `character`.
| Name | tig_zips |
| Number of rows | 68 |
| Number of columns | 8 |
| _______________________ | |
| Column type frequency: | |
| character | 6 |
| numeric | 2 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| ZCTA5CE20 | 0 | 1 | 5 | 5 | 0 | 68 | 0 |
| AFFGEOID20 | 0 | 1 | 14 | 14 | 0 | 68 | 0 |
| GEOID20 | 0 | 1 | 5 | 5 | 0 | 68 | 0 |
| NAME20 | 0 | 1 | 5 | 5 | 0 | 68 | 0 |
| LSAD20 | 0 | 1 | 2 | 2 | 0 | 1 | 0 |
| geometry | 0 | 1 | 830 | 10835 | 0 | 68 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| ALAND20 | 0 | 1 | 98762692 | 125048235 | 1475441 | 22687804 | 41116413.5 | 110644621 | 521452503 | ▇▁▁▁▁ |
| AWATER20 | 0 | 1 | 1867545 | 4648935 | 0 | 0 | 257508.5 | 1117435 | 26097562 | ▇▁▁▁▁ |
Important to note that the dataset has several date fields, and that alone will be ripe with opportunity for exploration. However I want to take a look at some of the other variables first, to see if maybe isolating to one of the other categories, or including other categorical variables in the time series could be useful. First off is Zip Code, because location is often a major factor in determining the amount and type of crime that would occur.
d1 %>%
mutate(years = year(Occurred.Date),
years = as.character(years)
) %>%
group_by(years, Zip.Code) %>%
summarize(
count = n()
) %>% as.data.frame() %>% pivot_wider(names_from = years, values_from = count)`summarise()` has grouped output by 'years'. You can override using the
`.groups` argument.
# A tibble: 69 × 23
Zip.Code `2003` `2004` `2005` `2006` `2007` `2008` `2009` `2010` `2011`
<int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
1 76574 1 4 1 NA 2 6 NA 1 NA
2 78610 1 17 14 10 5 8 9 6 5
3 78612 1 NA 1 3 NA 2 NA NA NA
4 78613 379 435 301 343 338 470 461 442 431
5 78617 748 756 722 744 840 1035 1038 1084 1063
6 78620 2 1 5 2 3 1 1 NA NA
7 78634 2 NA NA 1 1 1 NA NA NA
8 78641 11 3 6 9 5 11 1 3 NA
9 78642 1 NA NA NA NA 1 NA NA NA
10 78645 4 1 5 3 4 4 NA 1 NA
# ℹ 59 more rows
# ℹ 13 more variables: `2012` <int>, `2013` <int>, `2014` <int>, `2015` <int>,
# `2016` <int>, `2017` <int>, `2018` <int>, `2019` <int>, `2020` <int>,
# `2021` <int>, `2022` <int>, `2023` <int>, `2024` <int>
Layering in the year gives an interesting view, because it gives us the opportunity to see major increases in certain locations over time. However, there are a lot of zip Codes and a lot of years in this dataset. A visualization may help with this, my first thought being a chloropeth plot with a slider for the year value. Other quicker solutions might be looking at the individual years as observations of sorts, then looking at summary statistics. More to come here.
d1 %>%
mutate(years = year(Occurred.Date)
#years = as.character(years)
) %>%
#filter(years == '2023' | years == '2022') %>%
#group_by(Family.Violence, Highest.Offense.Description, years) %>%
group_by(Highest.Offense.Description, years) %>%
summarize(cnt = n(), .groups = 'drop') %>%
#group_by(Highest.Offense.Description, years) %>%
#mutate(tot = sum(cnt)) %>%
#relocate(tot, .after = years) %>%
mutate(Highest.Offense.Description = as.factor(Highest.Offense.Description)) %>%
as.data.frame() %>%
arrange(years) %>%
pivot_wider(names_from = years, values_from = cnt) # A tibble: 436 × 23
Highest.Offense.Description `2003` `2004` `2005` `2006` `2007` `2008` `2009`
<fct> <int> <int> <int> <int> <int> <int> <int>
1 ABUSE OF OFFICIAL CAPACITY 1 5 2 2 4 2 1
2 AGG ASLT W/MOTOR VEH FAM/DA… 25 41 37 55 52 54 50
3 AGG ASSAULT 718 809 805 810 849 971 885
4 AGG ASSAULT FAM/DATE VIOLEN… 426 446 497 564 610 628 605
5 AGG ASSAULT ON PUBLIC SERVA… 26 25 16 25 14 11 24
6 AGG ASSAULT WITH MOTOR VEH 119 147 104 88 109 121 102
7 AGG FORCED SODOMY 5 2 2 2 3 3 3
8 AGG FORCED SODOMY OF CHILD 36 34 24 23 11 15 1
9 AGG KIDNAPPING 7 6 7 6 8 9 9
10 AGG PERJURY 1 NA NA NA NA NA NA
# ℹ 426 more rows
# ℹ 15 more variables: `2010` <int>, `2011` <int>, `2012` <int>, `2013` <int>,
# `2014` <int>, `2015` <int>, `2016` <int>, `2017` <int>, `2018` <int>,
# `2019` <int>, `2020` <int>, `2021` <int>, `2022` <int>, `2023` <int>,
# `2024` <int>
Similar to our zip code variable here, but there are far too many crime descriptions to use. Grouping some together may work, I don’t think we have enough information in the dataset overall to do any sort of classification modeling however, so it may amount to a rules based classification using regular expressions. Other variables may still help with this, such as the UCF classification variables.
d1 %>%
mutate(
Category.Description = if_else(
Category.Description == '', 'NA/Unknown', Category.Description
)
) %>%
group_by(Category.Description, Highest.Offense.Description) %>%
summarize(
cnt = n()
) %>%
pivot_wider(
names_from = Category.Description, values_from = cnt
)`summarise()` has grouped output by 'Category.Description'. You can override
using the `.groups` argument.
# A tibble: 436 × 9
Highest.Offense.Description `Aggravated Assault` `Auto Theft` Burglary Murder
<chr> <int> <int> <int> <int>
1 AGG ASLT ENHANC STRANGL/SU… 1098 NA NA NA
2 AGG ASLT STRANGLE/SUFFOCATE 7712 NA NA NA
3 AGG ASLT W/MOTOR VEH FAM/D… 772 NA NA NA
4 AGG ASSAULT 18634 NA NA NA
5 AGG ASSAULT BY PUBLIC SERV… 10 NA NA NA
6 AGG ASSAULT FAM/DATE VIOLE… 9603 NA NA NA
7 AGG ASSAULT ON PEACE OFFIC… 70 NA NA NA
8 AGG ASSAULT ON PUBLIC SERV… 329 NA NA NA
9 AGG ASSAULT WITH MOTOR VEH 1823 NA NA NA
10 ARSON WITH BODILY INJURY 15 NA NA NA
# ℹ 426 more rows
# ℹ 4 more variables: `NA/Unknown` <int>, Rape <int>, Robbery <int>,
# Theft <int>
The UCF classifiers are much more consumable, but there are more incidents that are not classified than are. That said, this might be a good start for a rules based classification. Many of the unclassified incidents would not fit very nicely into the groups anyway, so some new categories may need to be made. I may or may not come back to the re-classification part, but ultimately the genesis of this project was with reports on the amount of homicide in the city, so the already existing descriptions may be enough to drill down into just those.
reminder to myself to save figures from this file to bring into manuscript and presentation later.
Data exploration through figures
Time is a variable that is much easier to interpret with a plot, so let’s start there. Time series first.
dayline <- d1 %>% group_by(Occurred.Date) %>%
summarize(
cnt = n()
) %>% as.data.frame() %>%
ggplot(mapping = aes(x=Occurred.Date, y=cnt)) +
geom_line() +
labs(title = 'Austin Crime Over Time', x = 'Time of Occurence (by day)', y = 'Count') +
theme(panel.background = element_rect(fill = 'white',
color = 'white'),
plot.background = element_rect(fill = 'white',
color = 'white')
)
dayline
ggsave(here('results', 'figures', 'static-plots', 'crime-by-day.png'), dayline)Saving 7 x 5 in image
I expected aggregating by day to actually look worse than this. There is a clear trend, upwards at first but then starts decreasing around 2008 or so. Seeing this I am tempted to looking into time series modeling. There may be a seasonal pattern in some of those spikes, but that may be something that can be resolved with different methods. I’m tempted to go ARIMA, because frankly that’s the one I’m familiar with, but I’ll have to do some reading up on my options. An interesting idea might be to fit the model, make predictions, and then see where I landed compared to the actual data before the class ends.
monthline <- d1 %>% filter(Occurred.Date <= '2024-03-31') %>%
mutate(Occurred.Date = trunc.Date(Occurred.Date, 'months')) %>%
group_by(Occurred.Date) %>%
summarize(
cnt = n()
) %>% as.data.frame() %>%
ggplot(mapping = aes(x=Occurred.Date, y=cnt)) +
geom_line() +
labs(title = 'Austin Crime Over Time', x = 'Time of Occurence (by month)', y = 'Count') +
theme(panel.background = element_rect(fill = 'white',
color = 'white'),
plot.background = element_rect(fill = 'white',
color = 'white')
)
monthline
ggsave(here('results', 'figures', 'static-plots', 'crime-by-month.png'), monthline)Saving 7 x 5 in image
Monthly looks even better, again further lending itself to the idea of a time series model. The downward spikes are more prevalent here and may cause problems themselves as well, but again the seasonality may be able to be resolved with things like logs or differencing. As a note, in this and the last plot I filter out 2024, mostly due to a large drop at the end from the incomplete month, but also for the aforementioned idea of comparing predictions to actual results.
A potentially quick way to identify “peak” months for crime would be via a bar chart aggregating all occurrences in each month together. I was hoping to see some quick trends here, but I see surprisingly little here. The highest month is May, potentially because of school letting out/ graduations for UT Austin, but I wouldnt say the difference is large enough compared to other months to really point to any one reason. This may be another one to layer in with time, like a chart with a slider for the year to see shifts in crime frequency by month over time.
murdline <- d1 %>% filter(Occurred.Date <= '2024-03-31', Crime.Category == 'MURDER') %>%
mutate(Occurred.Date = trunc.Date(Occurred.Date, 'months')) %>%
group_by(Occurred.Date) %>%
summarize(
cnt = n()
) %>% as.data.frame() %>%
ggplot(mapping = aes(x=Occurred.Date, y=cnt)) +
geom_line() +
labs(title = 'Austin Murder Over Time', x = 'Time of Occurence (by month)', y = 'Count') +
theme(panel.background = element_rect(fill = 'white',
color = 'white'),
plot.background = element_rect(fill = 'white',
color = 'white')
)
murdline
ggsave(here('results', 'figures', 'static-plots', 'murder-by-time.png'), murdline)Saving 7 x 5 in image
ADD COMMMENTARY LATER: NUMBER OF FIRST QUARTER HOMICIDES MATCHES ARTICLE FOR 2020 AND 2023, BUT NOT 2024. ARTICLE LIKELY NOT COUNTING INTOX MANSLAUGHTER OR JUSTIFIED HOMICIDE. KEPT THOSE BECAUSE FBI CATEGORIZATION COUNTS MANSLAUGHTER.
d2 <- d1 %>%
mutate(
Occurred.Month = month(Occurred.Date, label = TRUE),
Occurred.Day = weekdays(Occurred.Date),
Occurred.Year = year(Occurred.Date),
Occur.Report.Diff = Report.Date - Occurred.Date
)
d2 %>%
#filter(Occurred.Year == 2023) %>%
ggplot(aes(x=Occurred.Month)) +
geom_bar()
Now a Zip Code bar chart. Really a bit of a graphical representation of what we saw earlier, though I filtered it to the two most recent complete years. Again, looking for big swings, and simultaneously taking a quick peak at the Family Violence indicator. Surprisingly low volume there, which leads me to believe it just doesn’t get captured appropriately every time, especially when comparing with the descriptions themselves, which we will see in a moment. Going back to the zip codes, there might be an opportunity to do something like binning the zip codes with high medium and low crime areas, or something of the sort.
d1 %>%
mutate(years = year(Occurred.Date),
years = as.character(years)
) %>%
filter(years == '2003' | years == '2022') %>%
group_by(Family.Violence, Zip.Code, years) %>%
summarize(cnt = n(), .groups = 'drop') %>%
group_by(Zip.Code, years) %>%
mutate(tot = sum(cnt)) %>%
relocate(tot, .after = years) %>%
mutate(Zip.Code = as.character(Zip.Code)) %>%
as.data.frame() %>%
arrange(desc(tot)) %>% head(100) %>%
ggplot(aes(fill=Family.Violence, x= reorder(Zip.Code,cnt), y=cnt)) +
geom_bar(position = 'stack', stat='identity') +
facet_wrap(~years) +
coord_flip()
Now a bar chart by crime description. This is similar to the last plot, but a little more disparate here. There are so many more crime descriptions than zip codes, so they are much more easily spread thin. That said, that actually makes the top five crime descriptions stand out a little more. So going back to the idea of creating new classifications, that may be a good idea to make sure those 5 are appropriately captured. And again, the family violence indicator is captured almost entirely by one description, yet the top description is Family disturbance. An easy explanation may be because the description is of the Highest Offense in the incident, so when there is physical family violence, it is often the highest offense.
d1 %>%
mutate(years = year(Occurred.Date),
years = as.character(years)
) %>%
filter(years == '2023' | years == '2022') %>%
group_by(Family.Violence, Highest.Offense.Description, years) %>%
summarize(cnt = n(), .groups = 'drop') %>%
group_by(Highest.Offense.Description, years) %>%
mutate(tot = sum(cnt)) %>%
relocate(tot, .after = years) %>%
mutate(Highest.Offense.Description = as.factor(Highest.Offense.Description)) %>%
as.data.frame() %>%
arrange(desc(tot)) %>% head(50) %>%
ggplot(aes(fill=Family.Violence, x= reorder(Highest.Offense.Description,cnt), y=cnt)) +
geom_bar(position = 'stack', stat='identity') +
facet_wrap(~years) +
coord_flip() 
yearlyCrime <- d1 %>% mutate(
Zip.Char = as.character(Zip.Code),
years = year(Occurred.Date),
years = as.character(years)
) %>% filter(years != 2024) %>%
group_by(Zip.Char, years) %>%
summarize(crim.count = n()) %>%
ungroup() %>%
as.data.frame()`summarise()` has grouped output by 'Zip.Char'. You can override using the
`.groups` argument.
#filter(count >= 500) %>%
#filter(Zip.Char == '78741') %>%
allYrsZips <- expand.grid(years = unique(yearlyCrime$years), Zip.Char = unique(yearlyCrime$Zip.Char))
yearlyCrimeAll <- left_join(allYrsZips, yearlyCrime, by = join_by(Zip.Char, years)) %>%
inner_join(tig_zips, by = join_by(Zip.Char == ZCTA5CE20)) %>%
mutate(crim.count = ifelse(is.na(crim.count), 0, crim.count))cpeth <- yearlyCrimeAll %>% filter(crim.count >= 50) %>%
ggplot() +
geom_sf(aes(geometry = geometry, fill = crim.count, frame = years, text = paste(Zip.Char, '<br>', crim.count))) +
scale_fill_gradient2(low = "#E0144C", mid = '#FFFFFF', high = "#000067", midpoint = -5000,
trans = 'reverse') +
theme_classic() +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.text.y=element_blank(),
line = element_blank(),
plot.title = element_text(hjust = 0, size = 15)) +
labs(fill = 'Crime Count',
title = 'Crime by Location Over Time')Warning in layer_sf(geom = GeomSf, data = data, mapping = mapping, stat = stat,
: Ignoring unknown aesthetics: frame and text
cplotly <- cpeth %>% ggplotly() %>%
layout(hoverlabel = text) %>%
animation_opts(1500, 1) %>%
animation_slider(currentvalue = list(visible = FALSE)) %>%
animation_button(x = 0, xanchor = "left", y = 0, yanchor = "bottom")
cplotlysaveWidget(frameableWidget(cplotly), here('results', 'figures', 'html-widgets', 'crime-location.html'))yearlyMurder <- d1 %>% mutate(
Zip.Char = as.character(Zip.Code),
years = year(Occurred.Date),
years = as.character(years)
) %>% filter(years != 2024, Crime.Category == 'MURDER') %>%
group_by(Zip.Char, years) %>%
summarize(murder.count = n()) %>%
ungroup() %>%
as.data.frame()`summarise()` has grouped output by 'Zip.Char'. You can override using the
`.groups` argument.
yearlyMurderAll <- left_join(allYrsZips, yearlyMurder, by = join_by(Zip.Char, years)) %>%
inner_join(tig_zips, by = join_by(Zip.Char == ZCTA5CE20)) %>%
mutate(murder.count = ifelse(is.na(murder.count), 0, murder.count))cpeth2 <- yearlyMurderAll %>% #filter(murder.count >= 1) %>%
ggplot() +
geom_sf(aes(geometry = geometry, fill = murder.count, frame = years)) +
#geom_sf_label(aes(label = paste(Zip.Char, '<br>', murder.count))) +
scale_fill_gradient2(low = "#E0144C", mid = '#FFFFFF', high = "#000067",
trans = 'reverse') +
theme_classic() +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.text.y=element_blank(),
line = element_blank(),
plot.title = element_text(hjust = 0, size = 15)) +
labs(fill = 'Murder Count',
title = 'Murder by Location Over Time')Warning in layer_sf(geom = GeomSf, data = data, mapping = mapping, stat = stat,
: Ignoring unknown aesthetics: frame
cplotly2 <- cpeth2 %>% ggplotly() %>%
layout(hoverlabel = text) %>%
animation_opts(1500, 1) %>%
animation_slider(currentvalue = list(visible = FALSE)) %>%
animation_button(x = 0, xanchor = "left", y = 0, yanchor = "bottom")
cplotly2catYearly <- d1 %>% mutate(
years = year(Occurred.Date)
#years = as.character(years)
) %>% filter(years != 2024) %>%
group_by(Crime.Category, years) %>%
summarize(crim.count = n()) %>%
ungroup() %>%
as.data.frame()`summarise()` has grouped output by 'Crime.Category'. You can override using
the `.groups` argument.
allYrsCats <- expand.grid(years = unique(catYearly$years), Crime.Category = unique(catYearly$Crime.Category))
yearlyAllCats <- left_join(allYrsCats, catYearly, by = join_by(Crime.Category, years)) %>%
mutate(crim.count = ifelse(is.na(crim.count), 0, crim.count))
catbar <- yearlyAllCats %>%
ggplot() +
geom_bar(aes(y= fct_reorder(Crime.Category, crim.count), x = crim.count, frame = years), stat = 'identity') +
labs(x = 'Count of Offense', y = 'Offense Description', title = 'Frequncy of Crim by Offense Category')Warning in geom_bar(aes(y = fct_reorder(Crime.Category, crim.count), x =
crim.count, : Ignoring unknown aesthetics: frame
#scale_x_continuous(transform = 'log')
catbar
fig <- plot_ly(yearlyAllCats,
x = ~crim.count,
y = ~fct_reorder(Crime.Category, crim.count),
type = 'bar',
showlegend = F,
frame = ~years,
marker = list(color = '#000067')
)
catbarly <- fig %>%
layout(
yaxis = list(
title = '', tickangle = -30, tickfont = list(size = 8)
),
xaxis = list(
title = 'Count of Occurence'
),
title = list(
text = 'Yearly Occurence of Crime by Category',
y = 0.99, x = 0.1, xanchor = 'left', yanchor = 'top',
font = list(
size = 18
)
)
)
catbarlysaveWidget(frameableWidget(catbarly), here('results', 'figures', 'html-widgets', 'crime-cat.html'))First table is intended to act as a translation key between description and category, but also serves well to illustrate the motivation behind the categorizations.
RecatTable <- d1 %>%
filter(year(Occurred.Date) < 2024) %>%
mutate(
Occur.Years = trunc.Date(Occurred.Date, 'years')
) %>%
group_by(Crime.Category, Highest.Offense.Description, Occur.Years) %>%
summarize(
Count = n(),
.groups = 'drop'
) %>%
arrange(by = Occur.Years) %>%
group_by(Crime.Category, Highest.Offense.Description) %>%
summarize(
mean = mean(Count),
sd = sd(Count),
sum = sum(Count),
.groups = 'drop'
) %>%
gt() %>%
tab_header(
title = md('**Crime Recategorization Key**')
) %>%
cols_label(
Crime.Category = md('**New Category**'),
Highest.Offense.Description = md('**Original Description**'),
mean = md('**Mean Occurrence Across Years**'),
sd = md('**Standard Deviation of Yearly Occurrence**'),
sum = md('**Total Occurrences**')
) %>%
fmt_number(columns = mean:sd, decimals = 1) %>%
opt_interactive()
gtsave(RecatTable, 'recat-table.html', here('results', 'figures', 'html-widgets'))
#saveWidget(frameableWidget(RecatTable), here('results', 'figures', 'html-widgets', 'recat-table.html'))
RecatTableCrime Recategorization Key
Plot highlights the advantages in the categorization, with inclusion of sparklines for clarity on movement over time.
newCatTable <- d1 %>%
filter(year(Occurred.Date) < 2024) %>%
mutate(
Occur.Years = trunc.Date(Occurred.Date, 'years')
) %>%
group_by(Crime.Category, Occur.Years) %>%
summarize(
Count = n(),
.groups = 'drop'
) %>%
arrange(by = Occur.Years) %>%
group_by(Crime.Category) %>%
summarize(
mean = mean(Count),
sd = sd(Count),
list = list(Count),
.groups = 'drop'
) %>%
gt() %>%
tab_header(
title = md('**Crime Categories Over Time**')
) %>%
cols_label(
Crime.Category = md('**Crime Category**'),
mean = md('**Mean Occurrence Across Years**'),
sd = md('**Standard Deviation of Yearly Occurrence**'),
list = md('**Change Over Time**')
) %>%
fmt_number(columns = mean:sd, decimals = 1) %>%
gtExtras::gt_plt_sparkline(column = list, type = 'ref_iqr', same_limit = FALSE) %>%
tab_footnote('Note: Sparklines have independent axes, comparisons should not be considered across different categories')
gtsave(newCatTable, 'new-cat-table.html', here('results', 'figures', 'html-widgets'))Warning in geom_segment(aes(x = min(.data$x), y = stats::median(.data$y), : All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
Warning in geom_segment(aes(x = min(.data$x), y = stats::median(.data$y), : All aesthetics have length 1, but the data has 15 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
Warning in geom_segment(aes(x = min(.data$x), y = stats::median(.data$y), : All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
Warning in geom_segment(aes(x = min(.data$x), y = stats::median(.data$y), : All aesthetics have length 1, but the data has 8 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
Warning in geom_segment(aes(x = min(.data$x), y = stats::median(.data$y), : All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
newCatTableWarning in geom_segment(aes(x = min(.data$x), y = stats::median(.data$y), : All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
Warning in geom_segment(aes(x = min(.data$x), y = stats::median(.data$y), : All aesthetics have length 1, but the data has 15 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
Warning in geom_segment(aes(x = min(.data$x), y = stats::median(.data$y), : All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
Warning in geom_segment(aes(x = min(.data$x), y = stats::median(.data$y), : All aesthetics have length 1, but the data has 8 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
Warning in geom_segment(aes(x = min(.data$x), y = stats::median(.data$y), : All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
All aesthetics have length 1, but the data has 21 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.
Crime Categories Over Time |
|||
|---|---|---|---|
Crime Category |
Mean Occurrence Across Years |
Standard Deviation of Yearly Occurrence |
Change Over Time |
| ABUSE OF OFFICE | 6.4 | 3.9 | |
| AGGRAVATED ASSAULT | 2,029.0 | 411.7 | |
| ALCOHOL RELATED | 4,876.4 | 1,366.1 | |
| AUTO THEFT | 2,822.5 | 1,233.1 | |
| BRIBERY | 1.3 | 0.8 | |
| BURGLARY | 6,135.7 | 1,565.7 | |
| CRIMINAL CONSPIRACY | 29.8 | 16.1 | |
| CRIMINAL MISCHIEF | 7,827.0 | 1,735.4 | |
| DISORDERLY CONDUCT | 1,159.0 | 254.1 | |
| DRUG RELATED | 6,058.4 | 1,811.2 | |
| FINANCIAL CRIME | 1,458.5 | 571.2 | |
| FRAUD | 2,805.2 | 459.4 | |
| GAMBLING RELATED | 40.7 | 26.8 | |
| GENERAL DISTURBANCE | 14,422.1 | 2,150.4 | |
| HARASSMENT | 3,513.2 | 1,037.9 | |
| HARM OF VULNERABLE PERSONS | 1,173.2 | 321.2 | |
| JUSTIFIED HOMICIDE | 5.4 | 3.1 | |
| LARGE SCALE THREAT | 1,061.8 | 194.2 | |
| LITTERING | 132.8 | 104.5 | |
| MURDER | 40.0 | 16.2 | |
| NON-COMPLIANCE | 5,744.0 | 2,966.8 | |
| OTHER | 920.3 | 174.7 | |
| PROTECTIVE ORDER | 980.9 | 372.2 | |
| RAPE | 741.7 | 150.3 | |
| ROBBERY | 1,082.8 | 201.3 | |
| SEX OFFENSE | 687.4 | 119.0 | |
| SEX OFFENSE INVOLVING A CHILD | 290.6 | 84.4 | |
| SIMPLE ASSAULT | 10,780.3 | 812.8 | |
| THEFT | 32,048.1 | 4,129.8 | |
| TRESPASSING | 2,846.7 | 1,236.5 | |
| UNLAWFUL RESTRAINT | 59.5 | 11.5 | |
| VOCO | 3,251.1 | 2,688.0 | |
| WEAPON RELATED | 548.4 | 109.8 | |
| Note: Sparklines have independent axes, comparisons should not be considered across different categories | |||